Invariant determination

ABSTRACT

Examples disclosed herein relate to determining that an operation is accessing data on a persistent memory and retrieving a log of the operation. The examples may also include determining a type of the data being accessed by the persistent memory by the operation and identifying, from the log, a location in the persistent memory of the data accessed by the operation. The examples may also include determining contents of the data accessed by the persistent memory by the operation and determining whether the contents of the data hold an invariant corresponding to the type of data.

BACKGROUND

Persistent memory enables programs to persist in-memory data structures directly on byte-addressable non-volatile memory (NVM) for low latency. However, this may lead to data structures being more susceptible to software failures that accidentally corrupt the state of the data structure.

BRIEF DESCRIPTION OF THE DRAWINGS

The following detailed description references the drawings, wherein:

FIG. 1 is a block diagram of an example system for invariant determination;

FIG. 2 is a flowchart of an example method for invariant determination;

FIG. 3 is a flowchart of another example method for invariant determination; and

FIG. 4 is a block diagram of another example system for invariant determination.

DETAILED DESCRIPTION

With persistent memory, programs can persist in-memory data structures directly on byte-addressable non-volatile memory (NVM). An application persists data directly on NVM as it creates and modifies in-memory data structures, and the application continues to have access to the persisted data after system restart. The benefit is flexible and low-latency persistence.

However, as modification of durable state is done through regular load/store memory instructions, durable state may be more susceptible to software failures that accidentally corrupt such state. Applications that rely on pointer integrity may do so within in-memory heap objects. Corruption of the memory heap (accidental or malicious) may lead to application crashes or serious security vulnerabilities.

Systems and methods for invariant verification described herein provide a framework that enables programmers to express key memory-safety invariants, such as no object overlap, correspondence between allocator and pointers, and reference counts. The framework may check and enforce such invariants either at recovery time, or at runtime at specific points where consistent invariants are expected to hold true such as at transaction commits.

Systems and methods for invariant verification described herein may use log files as a failsafe. For example, operations may be split into transactions and before any transaction is performed, the transaction may be committed to the log. After each transaction belonging to a given operation is committed to the log, then the operation may be performed. In the case of a failure, such as a power outage, the log can be referenced. Techniques for invariant verification may leverage this log. Specifically, invariants may be checked for transactions in the log after the transactions have been committed to the log, but before the operation has been performed.
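As a non-limiting illustration, the ordering described above (commit to the log, check invariants, then perform the operation) may be sketched as follows. This is a minimal sketch under stated assumptions: a Python dictionary stands in for the persistent memory, and the names run_transaction, InvariantViolation and check are hypothetical and are not part of the disclosed examples.

```python
class InvariantViolation(Exception):
    """Raised when a logged transaction would break an invariant (hypothetical name)."""


def run_transaction(memory, writes, check):
    """Log the writes, check invariants, and only then apply the writes.

    memory: dict mapping address -> value (a stand-in for persistent memory).
    writes: list of (address, new_value) pairs making up the transaction.
    check:  callable(preview, log) -> bool, the invariant to verify.
    """
    # 1. Commit every write to the log before memory is touched.
    #    Each log entry records (address, old value, new value).
    log = [(addr, memory.get(addr), new) for addr, new in writes]

    # 2. Check the invariant against the state the log would produce.
    preview = dict(memory)
    for addr, _old, new in log:
        preview[addr] = new
    if not check(preview, log):
        raise InvariantViolation("transaction aborted; memory left untouched")

    # 3. The invariant holds, so the operation may now be performed.
    memory.update(preview)
    return log
```

Because the check runs against a preview built from the log, a failed check leaves the stand-in memory unchanged.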

A method for invariant determination may include determining that an operation is accessing data on a persistent memory and retrieving a log of the operation. The method may also include determining a type of the data being accessed by the persistent memory by the operation and identifying, from the log, a location in the persistent memory of the data accessed by the operation. The method may also include determining contents of the data accessed by the persistent memory by the operation and determining whether the contents of the data hold an invariant corresponding to the type of data.

FIG. 1 is a block diagram of an example system 100 for invariant determination. System 100 may include a processor 102 and a memory 104 that may be coupled to each other through a communication link (e.g., a bus). Processor 102 may include a single or multiple Central Processing Units (CPU) or another suitable hardware processor(s). In some examples, memory 104 stores machine readable instructions executed by processor 102 for system 100. Memory 104 may include any suitable combination of volatile and/or non-volatile memory, such as combinations of Random Access Memory (RAM), Read-Only Memory (ROM), flash memory, and/or other suitable memory.

Memory 104 stores instructions to be executed by processor 102 including instructions for operation determiner 106, log retriever 108, data type determiner 110, location identifier 112, data determiner 114, invariant handler 116 and/or other components. According to various implementations, system 100 may be implemented in hardware and/or a combination of hardware and programming that configures hardware. Furthermore, in FIG. 1 and other Figures described herein, different numbers of components or entities than depicted may be used.

System 100 may implement a software-based resilience solution to tolerate software bugs and failures that accidentally corrupt durable state stored in NVM. The solution may leverage the invariant relationships between data structures and the transactional nature of crash consistency mechanisms for NVM.

Some invariants may be generic (e.g. if a pointer of some type is non-null, the pointer points to an object of that type, or that no writes stray outside of allocated memory), some invariants may be specific to a particular data structure (e.g. that a linked list is free of cycles), and some invariants may be specific to the program itself (e.g. that a data element in one structure has a valid reference to a data element in another structure).
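As one illustration of a data-structure-specific invariant, the "free of cycles" property of a linked list may be checked as sketched below. The sketch assumes the list is represented as a mapping from node identifier to successor identifier, and uses Floyd's two-pointer cycle detection, which is a well-known technique chosen here for illustration and not one named by the disclosure. The function name is_acyclic is hypothetical.

```python
def is_acyclic(next_of, head):
    """Return True if the list starting at head has no cycle.

    next_of: dict mapping a node id to its successor id, or None at the tail.
    head:    id of the first node, or None for an empty list.
    """
    slow = fast = head
    # The fast pointer advances two nodes per step and the slow pointer one;
    # they can only meet again if the list loops back on itself.
    while fast is not None and next_of[fast] is not None:
        slow = next_of[slow]
        fast = next_of[next_of[fast]]
        if slow == fast:
            return False
    return True
```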

From these invariants (which may apply to the whole memory heap), system 100 may derive assertions that apply when one or more objects on the heap are modified. These assertions may be used to verify that the invariants hold after the given modifications occur.

Processor 102 may execute operation determiner 106 to determine that an operation is writing data to a persistent memory. The operation may be part of a program. In some aspects, the original program code may be modified so that its persistent memory allocations are annotated by the type of the object they are allocating. In some programming languages the object type may be implicit because allocation carries type information. Either way, the type information may be used to determine which invariants should be checked when memory is written to, as well as to check invariants concerning the types of objects at a given address.

The original program code may also be automatically instrumented so that all stores to persistent memory are logged. This log may be used to order and detect the changes effected by a transaction.

Processor 102 may execute log retriever 108 to retrieve a log of the operation. Each write to persistent memory may be added to the log by program code associated with the operation.

Processor 102 may execute data type determiner 110 to determine a type of the data being written to the persistent memory by the operation. The type of data may be allocated to the persistent memory by a memory allocator. Data types may include hash tables, linked lists, etc.

For example, data type determiner 110 may identify persistent data structures in the original program. These structures may encompass the scope of objects that may be statically or dynamically allocated in persistent memory. Data type determiner 110 may enumerate the structures or types in a way that allows annotations and invariants to refer to their types and fields through symbolic constants. For example, data type determiner 110 may create a list of named structure types, each with a distinct integer value assigned to it. Each structure may be associated with a list of fields, each with a distinct integer assigned to it, and with information about types and offsets that allows log parsing to turn a log entry from “address, value” into “struct type, field, value”.
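The enumeration and log translation described above may be sketched, by way of example only, as follows. The structure names, field names, offsets, base address and the tables LAYOUT and ALLOCATIONS are all hypothetical values chosen for illustration; the sketch also assumes that each write starts exactly at a field offset.

```python
from enum import IntEnum


class StructType(IntEnum):
    """Named structure types, each with a distinct integer (hypothetical)."""
    LIST_NODE = 1
    HASH_TABLE = 2


class ListNodeField(IntEnum):
    """Fields of LIST_NODE, each with a distinct integer (hypothetical)."""
    VALUE = 1
    NEXT = 2


# Hypothetical layout table: struct type -> {byte offset: field}.
LAYOUT = {StructType.LIST_NODE: {0: ListNodeField.VALUE, 8: ListNodeField.NEXT}}

# Hypothetical allocation metadata: base address -> (struct type, size in bytes).
ALLOCATIONS = {0x1000: (StructType.LIST_NODE, 16)}


def translate(address, value):
    """Turn an (address, value) log entry into (struct type, field, value)."""
    for base, (stype, size) in ALLOCATIONS.items():
        if base <= address < base + size:
            field = LAYOUT[stype].get(address - base)
            if field is None:
                raise ValueError("write does not start at a known field")
            return (stype, field, value)
    raise ValueError("write falls outside every allocated object")
```

A write that falls outside every recorded allocation is reported as an error, which corresponds to the generic invariant that no writes stray outside of allocated memory.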

Processor 102 may execute location identifier 112 to identify, from the log, a location in the persistent memory of the data written by the operation. Processor 102 may execute data determiner 114 to determine contents of the data written in the persistent memory by the operation. In some aspects, the contents of the data include a structure and data determiner 114 may further identify the structure in a program code associated with the operation.

Data determiner 114 may translate the addresses of writes (found in the log) into the identifiers of fields and structures by using metadata recorded at allocation time. These field identifiers may determine which invariants are checked. For example, the invariants may be selected based on the type of data identified (i.e. as described above in reference to data type determiner 110). In other words, each data type may be associated with a set of invariants. Accordingly, once that data type is identified, each invariant associated with the data type may be checked. In one example, a type of data may include a dictionary implemented as a hash table. An invariant associated with this data type may be that each entry stored in the table must be placed in the right slot as determined by the hash of its key.
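For illustration, the association between a data type and its set of invariants, together with the hash table invariant mentioned above, may be sketched as follows. The sketch assumes a simple table with one entry per slot and no collision handling, and uses a deterministic toy hash; slot_for, entries_in_right_slot, INVARIANTS and check_type are hypothetical names.

```python
def slot_for(key, capacity):
    """Toy deterministic hash (an assumption; a real table would use its own)."""
    return sum(key.encode("utf-8")) % capacity


def entries_in_right_slot(table):
    """Invariant: every stored entry sits in the slot given by the hash of its key.

    table: list of slots, each either None or a (key, value) pair.
    """
    return all(
        entry is None or slot_for(entry[0], len(table)) == index
        for index, entry in enumerate(table)
    )


# Each data type is associated with the set of invariants checked for it.
INVARIANTS = {"hash_table": [entries_in_right_slot]}


def check_type(type_name, obj):
    """Check every invariant registered for the identified data type."""
    return all(invariant(obj) for invariant in INVARIANTS.get(type_name, []))
```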

In some aspects, the invariant is provided by the programmer, and the framework associates it with certain fields of the dictionary. When the framework detects a change to these certain fields (e.g., size, key), the system may verify that the invariant holds.

The invariants may take into account the new value of the field, the old value of the field, the type of the enclosing structure, an array index if the field is a member or sub-member of an array, current values of other fields in the same structure, other writes in the same transaction, metadata recorded per allocated object (i.e. type, size) and/or other structures accessed either through some global mechanism or through references from the modified structure.
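One way to picture the inputs listed above is as a single record handed to each invariant. The following sketch is illustrative only: the WriteContext record, its attribute names, and the reference count rule (never negative, changes by exactly one per write) are assumptions made for the example and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class WriteContext:
    """Inputs an invariant may consult for one logged write (hypothetical)."""
    struct_type: str                      # type of the enclosing structure
    field_name: str                       # field that was written
    old_value: Any                        # value before the write
    new_value: Any                        # value after the write
    array_index: Optional[int] = None     # set if the field is an array member
    other_fields: dict = field(default_factory=dict)    # current sibling fields
    sibling_writes: list = field(default_factory=list)  # same-transaction writes


def refcount_step_ok(ctx):
    """Example invariant using both the old and the new value of a field."""
    return ctx.new_value >= 0 and abs(ctx.new_value - ctx.old_value) == 1
```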

Processor 102 may execute invariant handler 116 to determine whether the contents of the data hold an invariant corresponding to the type of data. The invariant may be based on at least one of a data type of the structure or a current value of a field in the structure other than the contents. The invariant may be based on at least one of a new value of a data field corresponding to the location or an old value of the data field corresponding to the location. The invariant may be a local invariant corresponding to the contents of the data written in the persistent memory by the operation and the local invariant may be adapted from a global invariant corresponding to the type of the data being written to the persistent memory by the operation. In other words, a global invariant holds over the entire state of a data structure, as opposed to a local invariant which is localized to the data changed by the operation. Building on the dictionary example described above in reference to data determiner 114, a global invariant may be that each entry stored in the table must be placed in the right slot as determined by the hash of its key. The local invariant adapted from this global invariant may specify, for example, a specific table, slot, hash and/or key in memory that should follow the invariant.
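The relationship between the global and local forms of the dictionary invariant may be sketched as follows, again assuming a table with one entry per slot and a deterministic toy hash. The names slot_for, local_invariant, global_invariant and check_transaction are hypothetical.

```python
def slot_for(key, capacity):
    """Toy deterministic hash (an assumption made for this sketch)."""
    return sum(key.encode("utf-8")) % capacity


def local_invariant(table, written_slot):
    """Local form: only the slot touched by the write is examined."""
    entry = table[written_slot]
    return entry is None or slot_for(entry[0], len(table)) == written_slot


def global_invariant(table):
    """Global form: the property must hold for every slot of the table."""
    return all(local_invariant(table, i) for i in range(len(table)))


def check_transaction(table, written_slots):
    """Check only the slots that the logged writes actually changed."""
    return all(local_invariant(table, i) for i in set(written_slots))
```

If the global invariant held before a transaction and the local invariant holds for every slot the transaction wrote, the global invariant still holds afterwards, so the cost of the check scales with the number of writes rather than with the size of the table.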

In some aspects, the operation may be a first operation that is part of a transaction and the invariant may be based on a second operation that is also part of the transaction.

Invariant checking may not occur when the write to persistent memory happens, but possibly at a later point in time. In other words, invariant handler 116 may determine whether the contents of the data hold an invariant corresponding to the type of data at certain consistency points. As used herein, a consistency point refers to when all threads reach a transaction commit point. Checking the invariants while a transaction is in progress may result in a “false positive”, as transactions are allowed to (and sometimes, may have to) violate invariants temporarily.
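The "false positive" described above may be illustrated with a structure whose count field must equal the number of stored items. The sketch is single-threaded and therefore models only one thread reaching its commit point; the names count_matches and apply_and_check are hypothetical.

```python
def count_matches(state):
    """Invariant: the count field equals the number of stored items."""
    return state["count"] == len(state["items"])


def apply_and_check(state, writes, check_each_write):
    """Apply writes in order and return the list of invariant check results.

    check_each_write=True  checks after every individual write.
    check_each_write=False checks once, after the last write (commit point).
    """
    results = []
    for field_name, value in writes:
        state[field_name] = value
        if check_each_write:
            results.append(count_matches(state))
    if not check_each_write:
        results.append(count_matches(state))
    return results
```

Appending an item takes two writes (the items, then the count). Checking after the first write reports a violation even though the transaction as a whole is valid, whereas a single check at the commit point reports none.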

The invariant handler 116 may allow the operation to proceed when it is determined that the invariant is held. The invariant handler 116 may abort the operation when it is determined that the invariant is not held.

Referring now to FIGS. 2-3, flow diagrams are illustrated in accordance with various examples of the present disclosure. The flow diagrams represent processes that may be utilized in conjunction with various systems and devices as discussed with reference to the preceding figures, such as, for example, system 100 described in reference to FIG. 1 and/or system 400 described in reference to FIG. 4. While illustrated in a particular order, the flow diagrams are not intended to be so limited. Rather, it is expressly contemplated that various processes may occur in different orders and/or simultaneously with other processes than those illustrated. As such, the sequence of operations described in connection with FIGS. 2-3 are examples and are not intended to be limiting. Additional or fewer operations or combinations of operations may be used or may vary without departing from the scope of the disclosed examples. Thus, the present disclosure merely sets forth possible examples of implementations, and many variations and modifications may be made to the described examples.

FIG. 2 is a flowchart of an example method 200 for invariant determination. Method 200 may start at block 202 and continue to block 204, where the method 200 may include determining that an operation is accessing data on a persistent memory. At block 206, the method may include retrieving a log of the operation. At block 208, the method may include determining a type of the data being accessed by the persistent memory by the operation. The type of data may be allocated to the persistent memory by a memory allocator.

At block 210 the method may include identifying, from the log, a location in the persistent memory of the data accessed by the operation. Each write to persistent memory may be added to the log by program code associated with the operation. At block 212 the method may include determining contents of the data accessed by the persistent memory by the operation. In some aspects, the contents of the data include a structure and the method may include identifying the structure in a program code associated with the operation.

At block 214, the method may include determining whether the contents of the data hold an invariant corresponding to the type of data. The invariant may be based on at least one of a data type of the structure or a current value of a field in the structure other than the contents. The invariant may be based on at least one of a new value of a data field corresponding to the location or an old value of the data field corresponding to the location. The invariant may be a local invariant corresponding to the contents of the data written in the persistent memory by the operation and the local invariant may be adapted from a global invariant corresponding to the type of the data being written to the persistent memory by the operation. In some aspects, the operation may be a first operation that is part of a transaction and the invariant may be based on a second operation also part of the transaction. The method may continue to block 216, where the method may end.

As described above, the method may include determining whether the contents of the data hold an invariant corresponding to the type of data. This is discussed in further detail below in regards to FIG. 3.

FIG. 3 is a flowchart of an example method 300 for invariant determination. Method 300 may start at block 302 and continue to block 304, where the method 300 may include determining whether the contents of the data hold an invariant corresponding to the type of data.

If it is determined that the invariant is not held (NO branch of block 304), at block 306, the method may involve aborting the operation. The method may continue to block 308, where the method may end. If it is determined that the invariant is held (YES branch of block 304), at block 310, the method may involve allowing the operation to proceed. The method may continue to block 312, where the method may end.

FIG. 4 is a block diagram of an example system 400 for invariant determination. In the example illustrated in FIG. 4, system 400 includes a processor 402 and a machine-readable storage medium 404. Although the following descriptions refer to a single processor and a single machine-readable storage medium, the descriptions may also apply to a system with multiple processors and multiple machine-readable storage mediums. In such examples, the instructions may be distributed (e.g., stored) across multiple machine-readable storage mediums and the instructions may be distributed (e.g., executed by) across multiple processors.

Processor 402 may be at least one central processing unit (CPU), microprocessor, and/or other hardware devices suitable for retrieval and execution of instructions stored in machine-readable storage medium 404. In the example illustrated in FIG. 4, processor 402 may fetch, decode, and execute instructions 406, 408, 410, 412, 414 and 416 to perform invariant determination. Processor 402 may include at least one electronic circuit comprising a number of electronic components for performing the functionality of at least one of the instructions in machine-readable storage medium 404. With respect to the executable instruction representations (e.g., boxes) described and shown herein, it should be understood that part or all of the executable instructions and/or electronic circuits included within one box may be included in a different box shown in the figures or in a different box not shown.

Machine-readable storage medium 404 may be any electronic, magnetic, optical, or other physical storage device that stores executable instructions. Thus, machine-readable storage medium 404 may be, for example, Random Access Memory (RAM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a storage drive, an optical disc, and the like. Machine-readable storage medium 404 may be disposed within system 400, as shown in FIG. 4. In this situation, the executable instructions may be “installed” on the system 400. Machine-readable storage medium 404 may be a portable, external or remote storage medium, for example, that allows system 400 to download the instructions from the portable/external/remote storage medium. In this situation, the executable instructions may be part of an “installation package”. As described herein, machine-readable storage medium 404 may be encoded with executable instructions for invariant determination.

The machine-readable storage medium may be non-transitory. Referring to FIG. 4, operation determine instructions 406, when executed by a processor (e.g., 402), may cause system 400 to determine that an operation is writing data to a persistent memory. Log retrieve instructions 408, when executed by a processor (e.g., 402), may cause system 400 to retrieve a log of the operation. Each write to persistent memory may be added to the log by program code associated with the operation.

Type determine instructions 410, when executed by a processor (e.g., 402), may cause system 400 to determine a type of the data being written to the persistent memory by the operation. The type of data may be allocated to the persistent memory by a memory allocator.

Log identify instructions 412, when executed by a processor (e.g., 402), may cause system 400 to identify, from the log, a location in the persistent memory of the data written by the operation. Translate instructions 414, when executed by a processor (e.g., 402), may cause system 400 to translate the location of the operation into an identifier of the data being written to the persistent memory. In some aspects, the contents of the data include a structure and the instructions may further cause system 400 to identify the structure in a program code associated with the operation. Invariant determine instructions 416, when executed by a processor (e.g., 402), may cause system 400 to determine whether the data holds an invariant corresponding to the type of data. If it is determined that the invariant does not hold, invariant determine instructions 416 may cause system 400 to abort the operation. If it is determined that the invariant does hold, invariant determine instructions 416 may cause system 400 to allow the operation to be performed.

The invariant may be based on at least one of a data type of the structure or a current value of a field in the structure other than the contents. The invariant may be based on at least one of a new value of a data field corresponding to the location or an old value of the data field corresponding to the location. The invariant may be a local invariant corresponding to the contents of the data written in the persistent memory by the operation and the local invariant may be adapted from a global invariant corresponding to the type of the data being written to the persistent memory by the operation. In some aspects, the operation may be a first operation that is part of a transaction and the invariant may be based on a second operation also part of the transaction.

The foregoing disclosure describes a number of examples for invariant determination. The disclosed examples may include systems, devices, computer-readable storage media, and methods for invariant determination. For purposes of explanation, certain examples are described with reference to the components illustrated in FIGS. 1-4. The content type of the illustrated components may overlap, however, and may be present in a fewer or greater number of elements and components. Further, all or part of the content type of illustrated elements may co-exist or be distributed among several geographically dispersed locations. Further, the disclosed examples may be implemented in various environments and are not limited to the illustrated examples.

Further, the sequence of operations described in connection with FIGS. 1-4 are examples and are not intended to be limiting. Additional or fewer operations or combinations of operations may be used or may vary without departing from the scope of the disclosed examples. Furthermore, implementations consistent with the disclosed examples need not perform the sequence of operations in any particular order. Thus, the present disclosure merely sets forth possible examples of implementations, and many variations and modifications may be made to the described examples.

The invention claimed is:
 1. A method comprising: determining that an operation is accessing data on a persistent memory; retrieving a log of the operation; determining a type of the data being accessed by the persistent memory by the operation; identifying, from the log, a location in the persistent memory of the data accessed by the operation; determining contents of the data accessed by the persistent memory by the operation; and determining whether the contents of the data hold an invariant corresponding to the type of data.
 2. The method of claim 1, wherein a memory allocator annotates the type of data allocated to the persistent memory.
 3. The method of claim 1, wherein a program code associated with the operation adds each write to persistent memory to the log.
 4. The method of claim 1, comprising: aborting the operation when it is determined that the invariant is not held.
 5. The method of claim 1, comprising: allowing the operation to proceed when it is determined that the invariant is held.
 6. The method of claim 1, wherein the contents of the data include a structure, the method comprising: identifying the structure in a program code associated with the operation.
 7. The method of claim 1, wherein the invariant is based on at least one of a data type of the structure or a current value of a field in the structure other than the contents.
 8. The method of claim 1, wherein the invariant is based on at least one of a new value of a data field corresponding to the location or an old value of the data field corresponding to the location.
 9. The method of claim 1, wherein the operation is a first operation that is part of a transaction and the invariant is based on a second operation part of the transaction.
 10. The method of claim 1, wherein the invariant is a local invariant corresponding to the contents of the data written in the persistent memory by the operation and the local invariant is adapted from a global invariant corresponding to the type of the data being written to the persistent memory by the operation.
 11. A system comprising: a memory storing a plurality of instructions; and a hardware processor configured to execute the plurality of instructions, the hardware processor, when executing the plurality of instructions, is configured to operate as: an operation determiner to determine that an operation is writing data to a persistent memory; a log retriever to retrieve a log of the operation; a data type determiner to determine a type of the data being written to the persistent memory by the operation; a location identifier to identify, from the log, a location in the persistent memory of the data written by the operation; a data determiner to determine contents of the data written in the persistent memory by the operation; and an invariant handler to determine whether the contents of the data hold an invariant corresponding to the type of data.
 12. The system of claim 11, wherein the invariant handler aborts the operation when it is determined that the invariant is not held.
 13. The system of claim 11, wherein the invariant handler allows the operation to proceed when it is determined that the invariant is held.
 14. The system of claim 11, wherein the type of data is allocated to the persistent memory by a memory allocator.
 15. The system of claim 11, wherein each write to persistent memory is added to the log by program code associated with the operation.
 16. A non-transitory machine-readable storage medium encoded with instructions, the instructions executable by a processor of a system to cause the system to: determine that an operation is writing data to a persistent memory; retrieve a log of the operation; determine a type of the data being written to the persistent memory by the operation; identify, from the log, a location in the persistent memory of the data written by the operation; translate the location of the operation into an identifier of the data being written to the persistent memory; and determine whether the data holds an invariant corresponding to the type of data.
 17. The non-transitory machine-readable storage medium of claim 16, comprising instructions to translate the location using metadata recorded by an allocator that allocates locations to the persistent memory.
 18. The non-transitory machine-readable storage medium of claim 16, comprising instructions to: identify a structure in a program code associated with the operation, the structure enclosing the data.
 19. The non-transitory machine-readable storage medium of claim 18, wherein the invariant is based on at least one of a data type of the structure or a current value of a field in the structure other than the data.
 20. The non-transitory machine-readable storage medium of claim 16, wherein the invariant is based on at least one of a new value of a data field corresponding to the location or an old value of the data field corresponding to the location.